AUTOMORPHISM GROUPS OF GAUSSIAN CHAIN GRAPH MODELS

JAN DRAISMA AND PIOTR ZWIERNIK 


Abstract. In this paper we extend earlier work on groups acting on Gaussian 
graphical models to Gaussian Bayesian networks and more general Gaussian models 
defined by chain graphs. We discuss the maximal group which leaves a given model 
invariant and provide basic statistical applications of this result. This includes 
equivariant estimation, maximal invariants and robustness. The computation of 
the group requires finding the essential graph. However, by applying Studený's
theory of imsets we show that computations for DAGs can be performed efficiently
without building the essential graph. In our proof we derive simple necessary and 
sufficient conditions on vanishing sub-minors of the concentration matrix in the 
model. 


1. Introduction 

Having an explicit group action on a parametric statistical model gives a better 
understanding of equivariant estimation or invariant testing for the model under 
consideration [BNBJJ82, Eat89, LR05, SS05]. In [DKZ13] we have identified the 
largest group that acts on an undirected Gaussian graphical model and we have 
shown how this group can be used to study equivariant estimators of the covariance 
matrix in this model class. In the present paper we extend our discussion to chain 
graph models. 

A chain graph $\mathcal{H}$ is a graph with both directed and undirected edges that contains
no semi-directed cycles, that is, sequences of nodes $i_1, \ldots, i_k, i_{k+1} = i_1$ such that for
every $j = 1, \ldots, k$ either $i_j - i_{j+1}$ or $i_j \to i_{j+1}$, with at least one edge directed. In
this paper we focus on chain graphs without flags (NF-CGs), that is, with no induced
subgraphs of the form $i \to j - k$. Note that both undirected graphs and directed
acyclic graphs (DAGs) are chain graphs without flags. For more details on these
graph-theoretic notions see Section 2.1.
Gaussian models on chain graphs constitute a flexible family of graphical models,
which contains both undirected Gaussian graphical models and Gaussian Bayesian
networks defined by directed acyclic graphs (DAGs). Let $\mathcal{H}$ be a NF-CG. Let $\mathbb{R}^{\mathcal{H}}$
denote the space of all $m \times m$ matrices $\Lambda = [\lambda_{ij}]$ such that $\lambda_{ij} = 0$ if $i \not\to j$ in $\mathcal{H}$; let
$\mathcal{S}^+_m$ denote the space of all $m \times m$ symmetric positive definite matrices and let $\mathcal{S}^+_{\mathcal{H}}$
be the subspace of $\mathcal{S}^+_m$ consisting of matrices $\Omega = [\omega_{ij}]$ such that $\omega_{ij} = 0$ if $i \neq j$ and
$i$ and $j$ are not joined by an undirected edge in $\mathcal{H}$. The Gaussian chain graph model $M(\mathcal{H})$ of a NF-CG $\mathcal{H}$ consists of all
Gaussian distributions on $\mathbb{R}^m$ with mean zero and concentration matrices $K$ of the
form

(1) $K = (I - \Lambda)\,\Omega\,(I - \Lambda)^T$ such that $\Lambda \in \mathbb{R}^{\mathcal{H}}$, $\Omega \in \mathcal{S}^+_{\mathcal{H}}$.

The set of all matrices of this form will be denoted by $\mathcal{K}(\mathcal{H})$.
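As an illustration of the parametrization (1), the following minimal sketch builds $K$ for the DAG $1 \to 2 \to 3$. The helper names (`matmul`, `transpose`, `concentration`) and the numerical parameter values are assumptions chosen for this example only; nodes are stored 0-based.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def concentration(lam, omega):
    """Return K = (I - Lambda) Omega (I - Lambda)^T as in (1)."""
    m = len(lam)
    i_minus_lam = [[(1 if r == c else 0) - lam[r][c] for c in range(m)]
                   for r in range(m)]
    return matmul(matmul(i_minus_lam, omega), transpose(i_minus_lam))


# DAG 1 -> 2 -> 3: Lambda is supported on the arrows (1,2) and (2,3).
lam = [[0, 2, 0],
       [0, 0, -3],
       [0, 0, 0]]
# A DAG has no undirected edges, so Omega is diagonal with positive entries.
omega = [[5, 0, 0],
         [0, 7, 0],
         [0, 0, 11]]

K = concentration(lam, omega)
for row in K:
    print(row)
```

The entry in position (1,3) of the resulting matrix vanishes for every choice of the arrow parameters, since rows 1 and 3 of $I - \Lambda$ have disjoint supports and $\Omega$ is diagonal here.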

Remark 1.1. Two non-equivalent definitions of chain graph models can be found
in the literature. Following [AMP01], they are referred to as LWF and AMP chain
graph models, after Lauritzen-Wermuth-Frydenberg [Fry90, LW89] and
Andersson-Madigan-Perlman [AMP01] (Alternative Markov Properties). These two
definitions differ in how exactly a graph encodes the defining set of conditional
independence statements. However, if $\mathcal{H}$ has no flags then both definitions coincide (see
[AMP01, Theorem 1, Theorem 4]).

Let $X$ be a Gaussian vector with covariance matrix $\Sigma \in M(\mathcal{H})$. A linear
transformation $g \in GL_m(\mathbb{R})$ yields another Gaussian vector $Y = gX$. A basic
question of equivariant inference is for which $g$ the covariance matrix $g\Sigma g^T$ of $Y$
still lies in $M(\mathcal{H})$. More formally, the general linear group $GL_m(\mathbb{R})$ acts on $\mathcal{S}^+_m$ by
$g \cdot \Sigma := g\Sigma g^T$. Fix a chain graph $\mathcal{H}$. We study the problem of finding

(2) $G := \{g \in GL_m(\mathbb{R}) \mid g \cdot M(\mathcal{H}) \subseteq M(\mathcal{H})\}$.

In other words, we want to find the stabilizer of $M(\mathcal{H})$ in $GL_m(\mathbb{R})$.

Remark 1.2. The set $G$ is a closed algebraic subgroup of $GL_m(\mathbb{R})$, and in particular
has the structure of a Lie group. First, it is clear from the definition that $G$ is closed
under matrix multiplication. To see that it is closed under inversion and closed in the
Zariski topology, we argue as follows. Let $\overline{M(\mathcal{H})}$ denote the Zariski closure of $M(\mathcal{H})$
in $\mathbb{R}^{m \times m}$, that is, the set of matrices in $\mathbb{R}^{m \times m}$ whose entries satisfy all polynomial
equations that hold identically on $M(\mathcal{H})$. Suppose that $g \in GL_m(\mathbb{R})$ maps $\overline{M(\mathcal{H})}$ into
itself. Then, since acting with $g$ preserves positive definite matrices and since $M(\mathcal{H})$
consists of all positive definite matrices in $\overline{M(\mathcal{H})}$ (see [DSS09, Proposition 3.3.13] for
the case of DAGs; the general chain graph case is similar), $g$ also preserves $M(\mathcal{H})$.
Conversely, any $g$ that maps $M(\mathcal{H})$ into itself also maps its Zariski closure into itself.
Thus $G$ may be characterized as the stabilizer of the real algebraic variety $\overline{M(\mathcal{H})}$.
This shows that $G$ is Zariski closed. To see that it is also closed under inversion,
note that $g \cdot \overline{M(\mathcal{H})}$ is a real algebraic variety of the same dimension as $\overline{M(\mathcal{H})}$ and
contained in $\overline{M(\mathcal{H})}$, hence equal to $\overline{M(\mathcal{H})}$. But then also $g^{-1} \cdot \overline{M(\mathcal{H})}$ equals $\overline{M(\mathcal{H})}$.

The problem in (2) can be alternatively phrased in terms of concentration matrices,
which will be more useful in our case. Let $GL_m(\mathbb{R})$ act on $\mathcal{S}^+_m$ by $g \cdot K := g^{-T} K g^{-1}$.
Now find all $g \in GL_m(\mathbb{R})$ such that $g \cdot \mathcal{K}(\mathcal{H}) \subseteq \mathcal{K}(\mathcal{H})$.
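The action on concentration matrices can be checked numerically on a small case. The sketch below is only an illustration: it assumes the model with the single constraint $K_{13} = 0$ (the undirected path on three nodes), uses a hypothetical tridiagonal $K$, and takes $g^{-1}$ as input so that no matrix inversion is needed. The helper names are not from the paper.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def elementary(m, i, j, t):
    """Return I + t * E_ij for 0-based indices i != j."""
    e = [[1 if r == c else 0 for c in range(m)] for r in range(m)]
    e[i][j] = t
    return e


def act(g_inv, k):
    """Return g . K = g^{-T} K g^{-1}, given the inverse of g."""
    return matmul(matmul(transpose(g_inv), k), g_inv)


# A positive definite concentration matrix with K_13 = 0.
K = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]
t = 3

# g = I + t E_12 has inverse I - t E_12; similarly for E_13.
K_after_E12 = act(elementary(3, 0, 1, -t), K)
K_after_E13 = act(elementary(3, 0, 2, -t), K)

print(K_after_E12[0][2])  # stays zero: g keeps K inside the model
print(K_after_E13[0][2])  # becomes non-zero: g leaves the model
```

Since the action preserves positive definiteness, only the zero pattern needs to be inspected in this small example.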


Example 1.3. The DAG $1 \to 2 \to 3$ defines a model given by a single conditional
independence statement $X_1 \perp\!\!\!\perp X_3 \mid X_2$ and hence is equal to the model on the
undirected graph $1 - 2 - 3$. Since the directed part of the latter graph is empty, by (1) the
model consists of all covariance matrices such that the corresponding concentration
matrices are of the form

\[ K = \Omega = \begin{pmatrix} * & * & 0 \\ * & * & * \\ 0 & * & * \end{pmatrix}. \]

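By [DKZ13, Theorem 1.1], $G$ in this case consists of invertible matrices of the form

\[ \begin{pmatrix} * & * & 0 \\ 0 & * & 0 \\ 0 & * & * \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} 0 & * & * \\ 0 & * & 0 \\ * & * & 0 \end{pmatrix}. \]
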
1.1. The group G. Example 1.3 showed that two different chain graphs may define
the same chain graph model. We discuss this in more detail in Section 2.2. For any
NF-CG $\mathcal{H}$ denote by $\mathcal{H}^*$ the unique graph without flags with the largest number of
undirected edges which induces the same Gaussian model as $\mathcal{H}$. The fact that such
a unique graph exists follows from Proposition 2.10 given later. For example, for the
DAG in Example 1.3 such a graph is given by the undirected graph $1 - 2 - 3$. By
$c^*(i)$ we denote the children of $i$ in $\mathcal{H}^*$, so $c^*(i) = \{j : i \to j \text{ in } \mathcal{H}^*\}$. Similarly, by
$n^*(i)$ we denote the set of neighbours of $i$ in $\mathcal{H}^*$, that is, nodes $j$ connected to $i$ by
an undirected edge, which we denote by $i - j$. We write

\[ N^*(i) := \{i\} \cup n^*(i) \cup c^*(i). \]

Our main results can be summarized as follows. For a fixed chain graph without
flags $\mathcal{H}$ with set of nodes given by $[m] := \{1, \ldots, m\}$ consider the set $G^0$ of invertible
matrices given by

(3) $G^0 := \{g = [g_{ij}] \in GL_m(\mathbb{R}) : g_{ij} = 0 \text{ if } N^*(i) \not\subseteq N^*(j)\}$.
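The zero pattern in (3) is easy to read off once the essential graph is known. The following sketch assumes the input already is an essential graph, given as lists of arrows and undirected edges on the nodes $1, \ldots, m$; it does not compute $\mathcal{H}^*$ itself. The function names are hypothetical.

```python
def closed_neighbourhoods(m, arrows, edges):
    """Return N(i) = {i} + neighbours of i + children of i for i = 1..m."""
    nbhd = {i: {i} for i in range(1, m + 1)}
    for i, j in arrows:   # arrow i -> j makes j a child of i
        nbhd[i].add(j)
    for i, j in edges:    # undirected edge i - j
        nbhd[i].add(j)
        nbhd[j].add(i)
    return nbhd


def free_entries(m, arrows, edges):
    """Pairs (i, j) with N(i) contained in N(j): entries not forced to 0 by (3)."""
    nbhd = closed_neighbourhoods(m, arrows, edges)
    return {(i, j) for i in nbhd for j in nbhd if nbhd[i] <= nbhd[j]}


# Undirected path 1 - 2 - 3 (the essential graph of Example 1.3).
path = free_entries(3, arrows=[], edges=[(1, 2), (2, 3)])
# The graph 1 -> 2 <- 3, which is its own essential graph.
collider = free_entries(3, arrows=[(1, 2), (3, 2)], edges=[])

print(sorted(path))
print(sorted(collider))
```

For the path the free off-diagonal positions are (1,2) and (3,2); for $1 \to 2 \leftarrow 3$ they are (2,1) and (2,3).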

Further, an automorphism of a chain graph is any permutation of its nodes that 
maps directed edges to directed edges and undirected edges to undirected edges. 
Theorem 1.4. Let $\mathcal{H}$ be a chain graph without flags. The group $G$ in (2) is generated
by its connected normal subgroup $G^0$ and the group $\mathrm{Aut}(\mathcal{H}^*)$ of automorphisms of
the essential graph $\mathcal{H}^*$.

In the undirected case, this theorem reduces to [DKZ13, Theorem 1.1]. However,
the proof in our current, more general setting is much more involved, first because
the set $\mathcal{K}(\mathcal{H})$ is not a linear space, and second because the characterization is in
terms of the essential graph rather than the graph itself.

Note that for some graphs there may be two nodes $i, j$ such that $N^*(i) = N^*(j)$.
In this case the transposition of $i$ and $j$ lies already in $G^0$, which shows that $G^0$ and
$\mathrm{Aut}(\mathcal{H}^*)$ may have a non-trivial intersection. In Section 4 we prove a more refined
version of Theorem 1.4 that gets rid of this redundancy.

Given a set of edges defining a chain graph without flags $\mathcal{H}$, we would like to find
$G^0$ by listing all pairs $(i, j)$ with $i, j \in [m]$ such that $g_{ij} = 0$ for all $g \in G^0$. Since
our theorem depends on computing the essential graph $\mathcal{H}^*$, a natural question arises
about the complexity of this computation. In Section 5 we show how $G^0$ can be efficiently
computed in the case of DAGs. We propose an efficient algorithm that does not
require computing the essential graph $\mathcal{H}^*$.

1.2. Existence and robustness of equivariant estimators. The description of
the group $G$ can be used to analyse inference for chain graph models. Let
$X = (X_1, \ldots, X_n)$ denote a random sample of length $n$ from the model $M(\mathcal{H})$. An
estimator of the covariance matrix of $X$ is any map $T_n : (\mathbb{R}^m)^n \to M(\mathcal{H})$. In this
paper we are interested in equivariant estimators, that is, estimators satisfying

(4) $T_n(g \cdot X) = g \cdot T_n(X)$ for every $g \in G$, $X \in (\mathbb{R}^m)^n$,

where the action of $G$ on $(\mathbb{R}^m)^n$ is

\[ g \cdot (x_1, \ldots, x_n) = (g x_1, \ldots, g x_n). \]

An important example of an equivariant estimator is the maximum likelihood
estimator. A natural theoretical question is how large the sample size needs to be so
that an equivariant estimator $T_n$ exists with probability one (see [DKZ13, Section
1.2]). Define ${\downarrow}i := \{j : N^*(i) \subseteq N^*(j)\}$.

Theorem 1.5. Let $\mathcal{H}$ be a chain graph without flags with set of nodes $[m]$. There
exists a $G$-equivariant estimator $T_n : (\mathbb{R}^m)^n \to M(\mathcal{H})$ of the covariance matrix of $X$
in the model $M(\mathcal{H})$ if and only if

\[ n \geq \max_{i \in [m]} |{\downarrow}i|. \]
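The bound in Theorem 1.5 depends only on the inclusion order among the sets $N^*(i)$. A minimal sketch of the computation, assuming the essential graph is supplied directly as arrows and undirected edges (the function names are hypothetical):

```python
def down_sets(m, arrows, edges):
    """Return the sets {j : N(i) is contained in N(j)} for i = 1..m."""
    nbhd = {i: {i} for i in range(1, m + 1)}
    for i, j in arrows:   # children of i
        nbhd[i].add(j)
    for i, j in edges:    # neighbours of i and of j
        nbhd[i].add(j)
        nbhd[j].add(i)
    return {i: {j for j in nbhd if nbhd[i] <= nbhd[j]} for i in nbhd}


def min_sample_size(m, arrows, edges):
    """Smallest n allowed by Theorem 1.5: the largest down-set size."""
    return max(len(s) for s in down_sets(m, arrows, edges).values())


# Essential graph 1 -> 2 <- 3.
collider_down = down_sets(3, arrows=[(1, 2), (3, 2)], edges=[])
collider_n = min_sample_size(3, arrows=[(1, 2), (3, 2)], edges=[])
# Essential graph 1 - 2 - 3.
path_n = min_sample_size(3, arrows=[], edges=[(1, 2), (2, 3)])

print(collider_down, collider_n, path_n)
```

For $1 \to 2 \leftarrow 3$ the largest set is ${\downarrow}2 = \{1, 2, 3\}$, so three observations are needed, while for the undirected path two suffice.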

Our next result is the formula for the maximal invariant (see [LR05, Section 6.2],
[DKZ13, Section 1.3]). It uses the equivalence relation $\sim$ on $[m]$ defined by $i \sim j$
if and only if $N^*(i) = N^*(j)$. We write $\bar{i}$ for the equivalence class of $i \in [m]$ and
$[m]/{\sim}$ for the set of all equivalence classes.

Theorem 1.6. Let $\mathcal{H}$ be a chain graph without flags. Suppose that $n \geq \max_i |{\downarrow}i|$.
Then the map $\tau : \mathbb{R}^{m \times n} \to \prod_{\bar{i} \in [m]/\sim} \mathbb{R}^{n \times n}$ given by

\[ x \mapsto \left( x[{\downarrow}i]^T \left( x[{\downarrow}i]\, x[{\downarrow}i]^T \right)^{-1} x[{\downarrow}i] \right)_{\bar{i} \in [m]/\sim}, \]

where $x[{\downarrow}i] \in \mathbb{R}^{|{\downarrow}i| \times n}$ is the submatrix of $x$ given by all rows indexed by ${\downarrow}i$, is a
maximal $G^0$-invariant.

Example 1.7. Consider the model defined by $1 \to 2 \leftarrow 3$. Then ${\downarrow}1 = \{1\}$, ${\downarrow}3 = \{3\}$
and ${\downarrow}2 = \{1, 2, 3\}$. This graph is essential and the corresponding maximal invariant
statistic is

\[ \left( x[1]^T (x[1]x[1]^T)^{-1} x[1],\; x^T (x x^T)^{-1} x,\; x[3]^T (x[3]x[3]^T)^{-1} x[3] \right), \]

where $x \in \mathbb{R}^{3 \times n}$ is a matrix whose columns are data points, and $x[i]$ denotes the $i$-th
row of this matrix. Here $\frac{1}{n} x[i]x[i]^T$ is just the sample variance of $X_i$.

In [DKZ13] we also used the structure of the group to provide non-trivial bounds
on the finite sample breakdown point for all equivariant estimators of the covariance
matrix for undirected Gaussian graphical models. These results extend to chain
graphs without flags.

Proposition 1.8. Assume that $n \geq \max_i |{\downarrow}i|$. Then for any $G$-equivariant estimator
$T : \mathbb{R}^{m \times n} \to M(\mathcal{H})$ the finite sample breakdown point at a generic sample $x$ is at most
$\lceil (n - \max_i |{\downarrow}i| + 1)/2 \rceil / n$.

Unlike the proof of Theorem 1.4, the proofs of Theorem 1.5, Theorem 1.6 and
Proposition 1.8 are similar to the undirected case because they depend on $G$ only
through the induced poset defined by the ordering relation $N^*(i) \subseteq N^*(j)$, which
determines the zero pattern of the group $G^0$. The proofs of these three results are
therefore omitted; see [DKZ13] for details.

Organization of the paper. In Section 2.1 we provide some basic graph-theoretical
definitions. The theory of Markov equivalence of chain graphs is discussed in
Section 2.2. In Section 3 we provide new results that give necessary and sufficient
vanishing conditions for subdeterminants of the concentration matrix $K \in \mathcal{K}(\mathcal{H})$. In
Section 4 we analyze the structure of the group $G$ in order to prove Theorem 1.4. In
Section 5 we show that in the case of DAG models, structural imsets give us all the
required information to identify $G$ without constructing the essential graphs. Section
6 contains some simple examples of Theorem 1.4.
2. Preliminaries

In this section we discuss basic notions of the theory of chain graphs and chain
graph models.

2.1. Basics of chain graphs. Let $\mathcal{H}$ be a hybrid graph, that is, a graph with both
directed and undirected edges, but neither loops nor multiple edges. This also excludes
the situation where two nodes are connected by both an undirected and a directed edge.
We assume that the set of nodes of $\mathcal{H}$ is labelled with $[m] = \{1, \ldots, m\}$. A directed
edge (arrow) from $i$ to $j$ is denoted by $i \to j$ and an undirected edge between $i$ and
$j$ is denoted by $i - j$. We write $i \cdots j$, and say that $i$ and $j$ are linked, whenever we
mean that either $i \to j$, or $i \leftarrow j$, or $i - j$.

An undirected path between $i$ and $j$ in a hybrid graph $\mathcal{H}$ is any sequence $k_1, \ldots, k_n$
of nodes such that $k_1 = i$, $k_n = j$ and $k_l - k_{l+1}$ in $\mathcal{H}$ for every $l = 1, \ldots, n-1$.
A semi-directed path between $i$ and $j$ is any sequence $k_1, \ldots, k_n$ of nodes such that
$k_1 = i$, $k_n = j$ and either $k_l - k_{l+1}$ or $k_l \to k_{l+1}$ in $\mathcal{H}$ for every $l = 1, \ldots, n-1$, and
$k_l \to k_{l+1}$ for at least one $l$. A directed path between $i$ and $j$ in a hybrid graph $\mathcal{H}$
is any sequence $k_1, \ldots, k_n$ of nodes such that $k_1 = i$, $k_n = j$ and $k_l \to k_{l+1}$ in $\mathcal{H}$
for every $l = 1, \ldots, n-1$. A semi-directed cycle in a hybrid graph $\mathcal{H}$ is a sequence
$k_1, \ldots, k_{n+1} = k_1$, $n \geq 3$, of nodes in $\mathcal{H}$ such that $k_1, \ldots, k_n$ are distinct, and this
sequence forms a semi-directed path. In a similar way we define an undirected cycle
and a directed cycle.

Definition 2.1. A chain graph (or CG) is a hybrid graph without semi-directed
cycles.

A set of nodes $T$ is connected in $\mathcal{H}$ if for every $i, j \in T$ there exists an undirected
path between $i$ and $j$. Maximal connected subsets in $\mathcal{H}$ with respect to set inclusion
are called components in $\mathcal{H}$. The class of components of $\mathcal{H}$ is denoted by $\mathcal{T}(\mathcal{H})$. The
elements of $\mathcal{T}(\mathcal{H})$ form a partition of the set of nodes of $\mathcal{H}$. For any subset $A \subseteq [m]$
of the set of vertices we define the induced graph on $A$, denoted by $\mathcal{H}_A$, as the graph
with set of nodes $A$ in which, for any two $i, j \in A$, we have $i \to j$, $j \to i$ or $i - j$ if and
only if $i \to j$, $j \to i$ or $i - j$ in $\mathcal{H}$, respectively.

Define the set of parents of $A \subseteq [m]$, denoted by $p_{\mathcal{H}}(A)$, as the set of $i \in [m]$ such
that $i \to a$ in $\mathcal{H}$ for some $a \in A$. The set of children $c_{\mathcal{H}}(A)$ is the set of $i \in [m]$ such
that $a \to i$ in $\mathcal{H}$ for some $a \in A$; and the set of neighbors $n_{\mathcal{H}}(A)$ is the set of all
$i \in [m]$ such that $i - a$ in $\mathcal{H}$ for some $a \in A$. In addition we define

\[ N_{\mathcal{H}}(i) := \{i\} \cup n_{\mathcal{H}}(i) \cup c_{\mathcal{H}}(i). \]

If $C$ is a connected set in a chain graph $\mathcal{H}$, then there are no arrows between
elements in $C$, for otherwise there would exist a semi-directed cycle. In particular,
the induced graph $\mathcal{H}_C$ on $C$ is an undirected graph and $p_{\mathcal{H}}(C)$ is disjoint from $C$ for
any $C \in \mathcal{T}(\mathcal{H})$. In addition, for every $A \subseteq [m]$ the induced subgraph $\mathcal{H}_A$ of a chain
graph $\mathcal{H}$ is a chain graph itself. A clique in an undirected graph is a subset of nodes
such that any two nodes are linked. We say that a clique is maximal if it is maximal
with respect to inclusion.

Definition 2.2. For any CG $\mathcal{H}$ an immorality is any induced subgraph of $\mathcal{H}$ of the
form $i \to j \leftarrow k$. A flag is any induced subgraph of the form $i \to j - k$. A chain
graph without flags is abbreviated by NF-CG.

Undirected graphs and DAGs are chain graphs without flags. We often use the 
following basic fact. 

Lemma 2.3. If $\mathcal{H}$ is a NF-CG then $p_{\mathcal{H}}(A) = p_{\mathcal{H}}(T)$ for every $T \in \mathcal{T}(\mathcal{H})$ and
non-empty $A \subseteq T$. In particular, for any two $i, j \in [m]$ such that $i - j$ in $\mathcal{H}$ we have
$p_{\mathcal{H}}(i) = p_{\mathcal{H}}(j)$.

Definition 2.4. Let $\mathcal{H}$ be a chain graph. For any two distinct components $T, T' \in
\mathcal{T}(\mathcal{H})$ consider the set of all arrows between $T$ and $T'$. If this set is non-empty then
we call it a meta-arrow and denote it by $T \Rightarrow T'$. That is,

\[ T \Rightarrow T' := \{i \to j : i \in T,\ j \in T',\ i \to j \text{ in } \mathcal{H}\}. \]

The notion of meta-arrow is important in the considerations of equivalence classes 
of chain graphs, which we discuss in the next section. 

2.2. Equivalence classes of chain graphs. A chain graph model is given by all
concentration matrices of the form (1). In Example 1.3 we saw that two different
chain graphs may give the same Gaussian model or, equivalently, the same set of
conditional independence statements. If two NF-CGs $\mathcal{G}$ and $\mathcal{H}$ define the same chain
graph model, we say that they are graph equivalent (or simply equivalent). For
example the three DAGs in Figure 1 are equivalent.

[Figure 1: the DAGs $1 \to 2 \to 3$, $1 \leftarrow 2 \leftarrow 3$ and $1 \leftarrow 2 \to 3$.]

Figure 1. Three equivalent DAGs.

The equivalence class of $\mathcal{H}$ in the set of NF-CGs is denoted by $\langle\mathcal{H}\rangle$:

\[ \langle\mathcal{H}\rangle = \{\mathcal{G} : \mathcal{G} \text{ is a NF-CG graph equivalent to } \mathcal{H}\}. \]

Equivalence of CGs and DAGs was discussed in many papers, for example [AMP97, 
Fry90, Rov05, VP91]. We briefly list the most relevant results. 

Definition 2.5. The skeleton of a chain graph $\mathcal{H}$ is the undirected graph such that
$i - j$ whenever $i \cdots j$ in $\mathcal{H}$.

Theorem 2.6. Two NF-CGs with the same set of nodes are equivalent if and only
if they have the same skeleton and the same immoralities.
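The criterion of Theorem 2.6 translates directly into a small check. The sketch below assumes each graph is a NF-CG on the same node set, given as a pair (arrows, undirected edges); it does not verify the chain graph or no-flag conditions, and the function names are hypothetical.

```python
from itertools import combinations


def skeleton(arrows, edges):
    """Set of linked pairs, forgetting directions."""
    return {frozenset(e) for e in list(arrows) + list(edges)}


def immoralities(arrows, edges):
    """Triples (i, j, k), i < k, with i -> j <- k and i, k not linked."""
    skel = skeleton(arrows, edges)
    parents = {}
    for i, j in arrows:
        parents.setdefault(j, set()).add(i)
    return {(i, j, k)
            for j, ps in parents.items()
            for i, k in combinations(sorted(ps), 2)
            if frozenset((i, k)) not in skel}


def equivalent(g, h):
    """Theorem 2.6: same skeleton and same immoralities."""
    return skeleton(*g) == skeleton(*h) and immoralities(*g) == immoralities(*h)


chain = ([(1, 2), (2, 3)], [])      # 1 -> 2 -> 3
reverse = ([(2, 1), (3, 2)], [])    # 1 <- 2 <- 3
fork = ([(2, 1), (2, 3)], [])       # 1 <- 2 -> 3
path = ([], [(1, 2), (2, 3)])       # 1 - 2 - 3
collider = ([(1, 2), (3, 2)], [])   # 1 -> 2 <- 3

print(equivalent(chain, reverse), equivalent(chain, fork),
      equivalent(chain, path), equivalent(chain, collider))
```

The three DAGs of Figure 1 and the undirected path all pass the test against each other, whereas $1 \to 2 \leftarrow 3$ differs from them by its immorality.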

The original statement of this result, given by Frydenberg in [Fry90], is more 
general and applies to any chain graph in the LWF definition of chain graph models. 

As was remarked in [Rov05], considering meta-arrows helps to understand
equivalence classes of chain graphs. Suppose that we want to obtain one chain graph from
another with the same skeleton by changing some of the arrows $i \to j$ to $i - j$ or
$i \leftarrow j$. Changing only a subset of arrows in a meta-arrow $T \Rightarrow T'$ is not permitted
as it would introduce semi-directed cycles. Hence the only permitted operations on
arrows of $\mathcal{H}$, if we work in the class of CGs, are either changing the directions of all
the elements of $T \Rightarrow T'$ or changing all arrows of $T \Rightarrow T'$ into undirected edges. The
following basic operation on a chain graph was defined in [Rov05, Stu04a].

Definition 2.7. Let $\mathcal{H}$ be a NF-CG and let $T \Rightarrow T'$ be a meta-arrow in $\mathcal{H}$, where
$T, T' \in \mathcal{T}(\mathcal{H})$. Merging of $T$ and $T'$ is the operation of changing all elements of the
meta-arrow $T \Rightarrow T'$ into undirected edges. Merging is called legal if

(a) $p_{\mathcal{H}}(T') \cap T$ is a clique of $T$;

(b) $p_{\mathcal{H}}(T') \setminus T = p_{\mathcal{H}}(T)$.

Lemma 2.8. Let $\mathcal{H}$ be a NF-CG and let $\mathcal{H}'$ be a graph obtained from $\mathcal{H}$ by legal
merging of two connected components. Then $\mathcal{H}' \in \langle\mathcal{H}\rangle$.

Proof. See for example the proof of Lemma 22 in [SRS09]. □

For two CGs $\mathcal{G}, \mathcal{H}$ with the same skeleton we write $\mathcal{G} \subseteq \mathcal{H}$ if, whenever
$i \to j$ in $\mathcal{G}$, then either $i \to j$ or $i - j$ in $\mathcal{H}$, and whenever $i - j$ in $\mathcal{G}$, then $i - j$ in
$\mathcal{H}$. We write $\mathcal{G} \subset \mathcal{H}$ if $\mathcal{G} \subseteq \mathcal{H}$ and $\mathcal{G} \neq \mathcal{H}$.

Theorem 2.9 (Roverato, Studený [Rov05, Stu04a]). Let $\mathcal{G}$ and $\mathcal{H}$ be two equivalent
NF-CGs such that $\mathcal{G} \subset \mathcal{H}$. Then there exists a finite sequence $\mathcal{G} = \mathcal{G}_0 \subset \cdots \subset \mathcal{G}_r =
\mathcal{H}$, with $r \geq 1$, of equivalent NF-CGs such that, for all $i = 1, \ldots, r$, $\mathcal{G}_i$ can be obtained
from $\mathcal{G}_{i-1}$ by a legal merging of two connected components of $\mathcal{G}_{i-1}$.

By the following proposition there is always a unique NF-CG representing $\langle\mathcal{H}\rangle$
with the largest number of undirected edges.

Proposition 2.10 (Roverato, Studený [Rov05, Stu04a]). There exists a unique
element $\mathcal{H}^*$ in $\langle\mathcal{H}\rangle$ that is maximal in the sense that $\mathcal{H}' \subseteq \mathcal{H}^*$ for every $\mathcal{H}' \in \langle\mathcal{H}\rangle$.

Definition 2.11. Let $\mathcal{H}$ be a NF-CG. The graph $\mathcal{H}^*$ of Proposition 2.10 is called
the essential graph. The directed arrows in $\mathcal{H}^*$ are called essential. For notational
convenience we write $p^*(A)$, $n^*(A)$ and $c^*(A)$ for $p_{\mathcal{H}^*}(A)$, $n_{\mathcal{H}^*}(A)$ and $c_{\mathcal{H}^*}(A)$,
respectively.




By definition $\mathcal{H}^*$ has the same skeleton as $\mathcal{H}$, and an edge is essential if and only
if it occurs as an arrow with the same orientation in every $\mathcal{H}' \in \langle\mathcal{H}\rangle$; all other edges
are undirected. For example, the essential graph for any of the graphs in Figure 1
is the undirected graph $1 - 2 - 3$, whereas the essential graph of $\mathcal{H} = 1 \to 2 \leftarrow 3$
is $\mathcal{H}$ itself. By Theorem 2.6, every arrow that participates in an immorality in $\mathcal{H}$
is essential, but $\mathcal{H}$ may contain other essential arrows. For example, in the DAG in
Figure 2 all arrows are essential but not all of them form immoralities.


Figure 2. A NF-CG on the nodes 1, 2, 3, 4 whose arrows are all essential but not
all part of immoralities.

The following result has been independently observed in [Rov05, Stu04a]. 

Theorem 2.12. If $\langle\mathcal{H}\rangle$ contains a DAG $\mathcal{G}$, then the essential graph $\mathcal{H}^*$ is equal to
the essential graph of a DAG as defined in [AMP97].

Remark 2.13. Our terminology is consistent with [Rov05]. However, in [AP06] the
essential graph for a chain graph is defined in a different way and it corresponds to
the essential graph $\mathcal{H}^*$ only if $\langle\mathcal{H}\rangle$ contains a DAG.

3. Subdeterminants of concentration matrices 

Let $\mathcal{H}$ be any chain graph on $[m]$. We want to determine which subdeterminants
of the concentration matrix of the corresponding model are identically zero on the
model. This provides simple necessary conditions for a concentration matrix to lie
in $\mathcal{K}(\mathcal{H})$. We will use the following combinatorial notions.

Definition 3.1. A cup in $\mathcal{H}$ is a quadruple $(i, j, k, l)$ of vertices in $\mathcal{H}$ where

(1) either $i = j$ or $i \to j$; and

(2) either $j = k$ or $j - k$; and

(3) either $k = l$ or $k \leftarrow l$.

We say that the cup starts in $i$ and ends in $l$.

Definition 3.2. Let $A$ and $B$ be sets of vertices of $\mathcal{H}$ of the same cardinality $d$. A
cup system from $A$ to $B$ is a set $U$ of $d$ cups in $\mathcal{H}$ whose starting points exhaust $A$
and whose end points exhaust $B$. The cup system $U$ from $A$ to $B$ gives rise to a
bijection $A \to B$ that sends $a \in A$ to the end point of the cup in $U$ that starts with
$a$. After fixing labellings $A = \{a_1, \ldots, a_d\}$ and $B = \{b_1, \ldots, b_d\}$, this bijection gives
rise to a permutation of $[d]$; define $\mathrm{sgn}(U)$ to be the sign of this permutation. The
cup system $U$ from $A$ to $B$ is said to be self-avoiding if for each $k = 1, 2, 3, 4$ the
elements $u_k \in [m]$, as $u = (u_1, u_2, u_3, u_4)$ ranges over $U$, are all distinct.

For the graph $1 \to 2 \leftarrow 3$ there is no self-avoiding cup system from $\{1, 2\}$ to $\{2, 3\}$,
but there is such a system between $\{1\}$ and $\{3\}$.

Definition 3.3. Let $\lambda_{ij}$ be the parameters corresponding to arrows $i \to j$ in $\mathcal{H}$ and
let $\omega_{ij}$ be the parameters corresponding to undirected edges $i - j$ and to the diagonal
($\omega_{ii}$). The weight of a cup $(i, j, k, l)$ in $\mathcal{H}$ is the product of the $(i, j)$-entry of $(I - \Lambda)$,
the $(j, k)$-entry of $\Omega$, and the $(k, l)$-entry of $(I - \Lambda)^T$, which is the $(l, k)$-entry of
$(I - \Lambda)$. The weight of a cup system $U$ from $A$ to $B$, denoted $w(U)$, is the product
of the weights of the cups in $U$. This is a monomial of degree $d$ in the $\omega_{ij}$ times a
monomial of degree at most $2d$ in the variables $-\lambda_{ij}$.

Let $K[A, B]$ denote the $A \times B$-submatrix of $K = (I - \Lambda)\Omega(I - \Lambda)^T$. By expanding
the entries, we find that

(5) $\det K[A, B] = \sum_{U} \mathrm{sgn}(U)\, w(U)$,

where the sum is over all cup systems $U$ from $A$ to $B$. In this expression cancellation
can occur because of the signs $\mathrm{sgn}(U)$ (not because of the signs in the $-\lambda_{ij}$, which
we might as well have taken as new variables). The following proposition captures
exactly which terms cancel. For more details on the arguments, we refer to [STD10,
DST13].

Proposition 3.4. Relative to the fixed labellings of $A$ and $B$, the $A \times B$-subdeterminant
of $K$ equals

\[ \det K[A, B] = \sum_{U \text{ self-avoiding}} \mathrm{sgn}(U)\, w(U). \]

Moreover, for any two self-avoiding cup systems $U$ and $U'$ with $w(U) = w(U')$ we
have $\mathrm{sgn}(U) = \mathrm{sgn}(U')$.

Proof. To see that the sum in (5) can be restricted to self-avoiding cup systems $U$,
we proceed as in the Lindström-Gessel-Viennot lemma [GV89, Theorem 1] and give
a sign-reversing involution $\sigma$ on the set of non-self-avoiding cup systems, as follows.
Order any cup system $U$ from $A$ to $B$ as $\{u_1, \ldots, u_d\}$ where $u_i$ starts in $a_i$. If $U$ is
not self-avoiding, let $a \in \{2, 3\}$ be minimal such that the entries $u_{ia}$, $i \in [d]$, are not
all distinct, and let $(i, i')$ be a lexicographically minimal pair such that $u_{ia} = u_{i'a}$.
Then $\sigma(U)$ is the cup system obtained from $U$ by replacing $u_i$ and $u_{i'}$ by their
swaps at position $a$. For instance, if $a = 2$, then $u'_i = (u_{i1}, u_{i2} = u_{i'2}, u_{i'3}, u_{i'4})$ and
$u'_{i'} = (u_{i'1}, u_{i'2} = u_{i2}, u_{i3}, u_{i4})$; and similarly for $a = 3$. Now $\mathrm{sgn}(\sigma(U)) = -\mathrm{sgn}(U)$,
$w(\sigma(U)) = w(U)$, and $\sigma$ is indeed an involution. This proves the expression in the proposition. The second
statement is more subtle, but it follows by applying [DST13, Theorem 3.3] to the
DAG obtained from $\mathcal{H}$ by reversing all arrows and replacing all undirected edges $i - j$
by a pair $i \leftarrow k \to j$ of arrows, where $k$ is a new vertex. Indeed, self-avoiding cup
systems in $\mathcal{H}$ correspond to special types of trek systems without sided intersection
in that new graph. □

Note that the set of covariance matrices in the model is characterized by which
subdeterminants vanish identically: the conditional independence statements
already suffice for this, and they are determinants (see for example [DSS09,
Proposition 3.1.13]). We do not know if this is true for the set of concentration matrices
as well. Therefore, Proposition 3.4 may well have other statistical applications, but
in what follows, we will mostly use the following direct consequence.

Corollary 3.5. The subdeterminant $\det K[A, B]$ is identically zero on the model
corresponding to $\mathcal{H}$ if and only if there does not exist a self-avoiding cup system from
$A$ to $B$ in $\mathcal{H}$.
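For small graphs the criterion of Corollary 3.5 can be tested by brute force. The sketch below enumerates cups as in Definition 3.1 (with conditions "$i = j$ or $i \to j$", "$j = k$ or $j - k$", "$k = l$ or $k \leftarrow l$") and searches for a self-avoiding cup system. It is exponential in $|A|$ and meant only as an illustration; the function names are hypothetical and $A$, $B$ are assumed to have the same cardinality.

```python
from itertools import permutations, product


def cups(nodes, arrows, edges, start, end):
    """All cups (start, j, k, end) in the graph."""
    arrows = set(arrows)
    und = set(edges) | {(j, i) for i, j in edges}
    return [(start, j, k, end)
            for j in nodes if j == start or (start, j) in arrows
            for k in nodes if k == j or (j, k) in und
            if k == end or (end, k) in arrows]


def has_self_avoiding_system(nodes, arrows, edges, a, b):
    """True if some cup system from a to b is self-avoiding."""
    a = list(a)
    for targets in permutations(b):          # bijection a -> b
        choices = [cups(nodes, arrows, edges, s, e)
                   for s, e in zip(a, targets)]
        for system in product(*choices):     # one cup per start point
            # In each of the four positions all entries must be distinct.
            if all(len({u[pos] for u in system}) == len(system)
                   for pos in range(4)):
                return True
    return False


nodes = [1, 2, 3]
collider = [(1, 2), (3, 2)]   # 1 -> 2 <- 3
chain = [(1, 2), (2, 3)]      # 1 -> 2 -> 3

print(has_self_avoiding_system(nodes, collider, [], [1, 2], [2, 3]))
print(has_self_avoiding_system(nodes, collider, [], [1], [3]))
print(has_self_avoiding_system(nodes, chain, [], [1], [3]))
```

The last call reflects the vanishing of the $(1,3)$-entry of $K$ for the DAG $1 \to 2 \to 3$: no cup starts in 1 and ends in 3.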

In the next section we begin our analysis of the group G , defined in (2), with a 
study of its connected component of the identity. 

4. The group G 

4.1. The connected component of the identity. Denote by $E_{ij}$ the matrix in
$\mathbb{R}^{m \times m}$ with all entries zero apart from the $(i, j)$-th element, which is 1. By $G^0$ denote
the normal subgroup of $G$ which forms the connected component of the identity
matrix. The subgroup $T^m$ of all diagonal and invertible matrices is contained in the
group $G$ because scaling of the vector $X$ does not affect conditional independencies. By
[DKZ13, Lemma 2.1], to compute $G^0$, it suffices to check for which $(i, j) \in [m] \times [m]$
the one-parameter groups $(I + tE_{ij})$, $t \in \mathbb{R}$, lie in $G$; or equivalently $E_{ij} \in \mathfrak{g}$, where
$\mathfrak{g}$ is the Lie algebra of $G$.

Before we provide the main result of this section we recall [DKZ13, Proposition
2.2].

Proposition 4.1 (Proposition 2.2, [DKZ13]). Let $\mathcal{H}$ be an undirected graph. For
$i, j \in [m]$ the matrix $E_{ij}$ lies in $\mathfrak{g}$ if and only if $N_{\mathcal{H}}(i) \subseteq N_{\mathcal{H}}(j)$.

If $\mathcal{H}$ is a NF-CG such that $\mathcal{H}^*$ is an undirected graph then Proposition 4.1 can be
used to characterize $G^0$ for $\mathcal{H}$ by passing to the essential graph. However, it is not
immediately clear how this result extends to all chain graphs without flags. We first
note that one direction of the above result holds in general.

Lemma 4.2. Let $\mathcal{H}$ be an NF-CG. If $N_{\mathcal{H}}(i) \subseteq N_{\mathcal{H}}(j)$, then $E_{ij} \in \mathfrak{g}$.
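Proof. If $i = j$ then the statement is clear, so suppose that $i \neq j$. We have $N_{\mathcal{H}}(i) \subseteq
N_{\mathcal{H}}(j)$ only if either $j \to i$ or $i - j$ in $\mathcal{H}$. Suppose first that $j \to i$. We have

\[ (I - tE_{ji})(I - \Lambda)\Omega(I - \Lambda)^T(I - tE_{ij}) = (I - \Lambda')\Omega(I - \Lambda')^T, \]

where $\Lambda' = \Lambda + tE_{ji} - tE_{ji}\Lambda$, that is, $\lambda'_{uv} = \lambda_{uv}$ if $u \neq j$; $\lambda'_{jv} = \lambda_{jv} - t\lambda_{iv}$ if $v \neq i$; and
$\lambda'_{ji} = \lambda_{ji} + t$. The fact that $\Lambda'$ lies in $\mathbb{R}^{\mathcal{H}}$ follows from $c_{\mathcal{H}}(i) \subseteq c_{\mathcal{H}}(j)$: for
every $v$, if $\lambda_{jv} = 0$ then $\lambda_{iv} = 0$.

If $i - j$ in $\mathcal{H}$ then $\{i\} \cup n_{\mathcal{H}}(i) \subseteq \{j\} \cup n_{\mathcal{H}}(j)$ and $p_{\mathcal{H}}(i) = p_{\mathcal{H}}(j)$ by Lemma 2.3. By
Proposition 4.1 applied to the undirected part of $\mathcal{H}$ we can write $\Omega = (I + tE_{ji})\Omega'(I +
tE_{ij})$ for some $\Omega' \in \mathcal{S}^+_{\mathcal{H}}$. Therefore

\[ (I - tE_{ji})(I - \Lambda)\Omega(I - \Lambda)^T(I - tE_{ij}) = (I - tE_{ji})(I - \Lambda)(I + tE_{ji})\,\Omega'\,(I + tE_{ij})(I - \Lambda)^T(I - tE_{ij}), \]

where we now show that there exists $\Lambda' \in \mathbb{R}^{\mathcal{H}}$ such that

\[ (I - tE_{ji})(I - \Lambda)(I + tE_{ji}) = (I - \Lambda'). \]

Indeed,

\[ \Lambda' = \Lambda + t\Lambda E_{ji} - tE_{ji}\Lambda - t^2 E_{ji}\Lambda E_{ji}, \]

where the last term must vanish because $\lambda_{ij} = 0$. Hence $\Lambda'$ is obtained from $\Lambda$ by
adding a multiple of the $j$-th column to the $i$-th column and by adding a multiple of
the $i$-th row to the $j$-th row. The fact that $\Lambda'$ lies in $\mathbb{R}^{\mathcal{H}}$ follows from the fact that
$c_{\mathcal{H}}(i) \subseteq c_{\mathcal{H}}(j)$ and $p_{\mathcal{H}}(i) = p_{\mathcal{H}}(j)$, that is, the $i$-th column has the same support as
the $j$-th column and the support of the $i$-th row is contained in the support of the
$j$-th row. □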

The converse of the lemma does not hold for a general NF-CG $\mathcal{H}$. Consider for
instance $1 \to 2 \to 3$. By Example 1.3, the element $I + tE_{12}$ lies in $G^0$ but
$N_{\mathcal{H}}(1) = \{1, 2\} \not\subseteq \{2, 3\} = N_{\mathcal{H}}(2)$. Nevertheless, the converse of the lemma above does
hold when $\mathcal{H}$ is essential; this is the main result of this section.

Theorem 4.3. Let $\mathcal{H}$ be an essential NF-CG. Then $E_{ij} \in \mathfrak{g}$ if and only if $N_{\mathcal{H}}(i) \subseteq
N_{\mathcal{H}}(j)$.

The proof is moved to the Appendix.

As we noted in the beginning of this section, the set of all $E_{ij} \in \mathfrak{g}$ already gives
the complete information on the group $G^0$. Hence Theorem 4.3 gives the description
of $G^0$ in (3).

Example 4.4. Consider the DAG $\mathcal{H} = 1 \to 2 \leftarrow 3$; then $N_{\mathcal{H}}(2) \subseteq N_{\mathcal{H}}(1) \cap N_{\mathcal{H}}(3)$.
Hence both $E_{21}$ and $E_{23}$ lie in $\mathfrak{g}$, but no other off-diagonal elements of matrices in $G^0$
can be non-zero.




4.2. The component group. Note that $G^0$ given in Theorem 4.3 in general is not
the whole group $G$. For example, both for the model of the undirected graph $1 - 2 - 3$
and for any of the equivalent DAGs in Figure 1, the permutation matrix

\[ \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \]

lies in $G$ but not in $G^0$. The following result shows that permutation matrices form
the basis for understanding the remaining part of the group $G$. For the proof see
[DKZ13, Proposition 2.5].

Proposition 4.5. Every element $g \in G$ can be written as $g = \sigma g_0$, where $g_0 \in G^0$
and $\sigma$ is a permutation matrix contained in $G$.

An automorphism of a hybrid graph is any bijection $\sigma : [m] \to [m]$ of its nodes
such that for every $i, j \in [m]$ we have $\sigma(i) - \sigma(j)$ if and only if $i - j$ and $\sigma(i) \to \sigma(j)$
if and only if $i \to j$.

Lemma 4.6. Let $\mathcal{H}$ be a NF-CG and $\mathcal{H}^*$ its essential graph. Let $\sigma \in GL_m(\mathbb{R})$ be a
permutation matrix. Then $\sigma \in G$ if and only if $\sigma$ is an automorphism of $\mathcal{H}^*$.

Proof. The model $M(\mathcal{H})$ is uniquely defined by the set of conditional independence
statements (see for example [Lau96]). Given a set of such statements that come from
a chain graph $\mathcal{H}$, the equivalence class $\langle\mathcal{H}\rangle$ is determined uniquely. The essential
graph $\mathcal{H}^*$ is the unique representative of $\langle\mathcal{H}\rangle$ with the largest number of undirected
edges. Since any permutation $\sigma$ applied to $\mathcal{H}^*$ gives a NF-CG with the same number
of undirected and directed edges (it simply relabels the nodes), $\sigma$ preserves the model
if and only if $\sigma$ is an automorphism of $\mathcal{H}^*$. □
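For graphs with few nodes, the permutations singled out by Lemma 4.6 can be listed by exhaustive search. The sketch below assumes that the essential graph $\mathcal{H}^*$ is given as lists of arrows and undirected edges on the nodes $1, \ldots, m$; the function name is hypothetical and the search over all $m!$ permutations is only practical for small $m$.

```python
from itertools import permutations


def automorphisms(m, arrows, edges):
    """Permutations of 1..m mapping arrows to arrows and edges to edges.

    Each automorphism is returned as the tuple (s(1), ..., s(m)).
    """
    arrows = set(arrows)
    und = {frozenset(e) for e in edges}
    result = []
    for image in permutations(range(1, m + 1)):
        s = dict(zip(range(1, m + 1), image))
        mapped_arrows = {(s[i], s[j]) for i, j in arrows}
        mapped_und = {frozenset(s[v] for v in e) for e in und}
        # A bijection preserves both edge sets iff it maps each onto itself.
        if mapped_arrows == arrows and mapped_und == und:
            result.append(image)
    return result


path_aut = automorphisms(3, arrows=[], edges=[(1, 2), (2, 3)])
collider_aut = automorphisms(3, arrows=[(1, 2), (3, 2)], edges=[])

print(path_aut)
print(collider_aut)
```

In both cases the only non-trivial automorphism swaps the nodes 1 and 3, which corresponds to the permutation matrix displayed at the beginning of this subsection.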

By Lemma 4.6 we can conclude that $G$ is generated by $G^0$ and the automorphism
group of $\mathcal{H}^*$, which proves Theorem 1.4.

Define an equivalence relation on $[m]$ by $i \sim j$ whenever $N^*(i) = N^*(j)$. For
example, if $\mathcal{H} = 1 \to 2$ then $\mathcal{H}^* = 1 - 2$ and hence $1 \sim 2$. The equivalence class of
$i \in [m]$ is denoted by $\bar{i}$.

As explained in the introduction, the expression $G = \mathrm{Aut}(\mathcal{H}^*)G^0$ is not minimal
in the sense that $\mathrm{Aut}(\mathcal{H}^*)$ and $G^0$ may intersect. To get rid of that intersection, we
define $\bar{\mathcal{H}}^*$ to be the graph with vertex set $[m]/{\sim}$ and $\bar{i} \to \bar{j}$ ($\bar{i} - \bar{j}$) in $\bar{\mathcal{H}}^*$ if and only
if $i \to j$ ($i - j$) in $\mathcal{H}^*$. We first show that $\bar{\mathcal{H}}^*$ is well defined.

Lemma 4.7. Let $\mathcal{H}$ be a NF-CG and $\mathcal{H}^*$ its essential graph. Two elements $i, j \in [m]$
are equivalent if and only if $\{i\} \cup n^*(i) = \{j\} \cup n^*(j)$, $p^*(i) = p^*(j)$ and $c^*(i) = c^*(j)$.
In particular the graph $\bar{\mathcal{H}}^*$ is well-defined.

Proof. If $N^*(i) = N^*(j)$ then $i$ and $j$ are necessarily linked. Since $i \in N^*(j)$ and
$j \in N^*(i)$ we conclude that in fact $i - j$ in $\mathcal{H}^*$. By Lemma 2.3, since $i - j$, we also
have $p^*(i) = p^*(j)$. This shows that $i \sim j$ if and only if $\{i\} \cup n^*(i) = \{j\} \cup n^*(j)$,
$c^*(i) = c^*(j)$ and $p^*(i) = p^*(j)$, which shows that the definition of the arrows and
edges in $\bar{\mathcal{H}}^*$ is independent of the representatives $i$ and $j$. □

Define $c : [m]/\sim \ \to \mathbb{N}$, $\bar{i} \mapsto |\bar{i}|$, and view $c$ as a coloring of the vertices of $\bar{\mathcal{H}}^*$ by
natural numbers. Let $\mathrm{Aut}(\bar{\mathcal{H}}^*, c)$ denote the group of automorphisms of $\bar{\mathcal{H}}^*$ preserving
the coloring. There is a lifting $\ell : \mathrm{Aut}(\bar{\mathcal{H}}^*, c) \to \mathrm{Aut}(\mathcal{H}^*)$ defined as follows: the
element $\tau \in \mathrm{Aut}(\bar{\mathcal{H}}^*, c)$ is mapped to the unique bijection $\ell(\tau) : [m] \to [m]$ that maps
each equivalence class $\bar{i}$ to the equivalence class $\tau(\bar{i})$ by sending the $k$-th smallest
element of $\bar{i}$ (in the natural linear order on $[m]$) to the $k$-th smallest element of $\tau(\bar{i})$,
for $k = 1, \ldots, |\bar{i}|$.
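
The lifting described above is easy to state operationally. The following Python sketch is only an illustration: the dictionary encoding of the equivalence classes and of the colour-preserving automorphism, as well as the name `lift`, are assumptions made here and do not come from the paper.

```python
def lift(classes, tau):
    """Lift a colour-preserving map on classes to a bijection of [m].

    classes: dict mapping a class label to the list of its elements in [m]
    tau:     dict mapping each class label to its image label
    The k-th smallest element of a class is sent to the k-th smallest
    element of the image class.
    """
    sigma = {}
    for label, members in classes.items():
        source = sorted(members)
        target = sorted(classes[tau[label]])
        if len(source) != len(target):
            raise ValueError("tau does not preserve the colouring c")
        for a, b in zip(source, target):
            sigma[a] = b
    return sigma


# Hypothetical example: two classes of size two are swapped, a singleton is fixed.
classes = {"x": [1, 4], "y": [2, 3], "z": [5]}
tau = {"x": "y", "y": "x", "z": "z"}
print(lift(classes, tau))
```

The size check mirrors the requirement that elements of $\mathrm{Aut}(\bar{\mathcal{H}}^*, c)$ preserve the colouring; without it the $k$-th smallest elements could not be matched.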

Example 4.8. Consider a DAG $\mathcal{H}$ and its essential graph $\mathcal{H}^*$ in Figure 3. Since 3
and 4 are equivalent, the induced essential graph $\bar{\mathcal{H}}^*$ is equal to the path
$\bar{1} - \bar{2} - \bar{3}$, where $\bar{3} = \{3, 4\}$. There
are no non-trivial automorphisms of this graph preserving cardinality of equivalence
classes and $\mathrm{Aut}(\bar{\mathcal{H}}^*, c) = \{I\}$. In particular $\ell$ is a trivial mapping.


Figure 3. On the left a DAG on four nodes. On the right its essential
graph.

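Theorem 4.9. The group $G$ equals $\ell(\mathrm{Aut}(\bar{\mathcal{H}}^*, c))G^0$, and the intersection
$\ell(\mathrm{Aut}(\bar{\mathcal{H}}^*, c)) \cap G^0$ is trivial, so $G$ is the semidirect product $\ell(\mathrm{Aut}(\bar{\mathcal{H}}^*, c)) \ltimes G^0$.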

Proof. It is a standard result from Lie group theory that the connected component
of the identity $G^0$ is a normal subgroup of $G$. Hence, to show that
$G = G^0 \rtimes \ell(\mathrm{Aut}(\bar{\mathcal{H}}^*, c))$ we need to show that $G = G^0 \cdot \ell(\mathrm{Aut}(\bar{\mathcal{H}}^*, c))$ and
$G^0 \cap \ell(\mathrm{Aut}(\bar{\mathcal{H}}^*, c)) = \{I\}$. The first part follows by Proposition 4.5 and Lemma 4.6.
To show that $G^0 \cap \ell(\mathrm{Aut}(\bar{\mathcal{H}}^*, c)) = \{I\}$ note that transpositions of $i$ and $j$ lie in $G^0$
precisely when $i$ and $j$ are equivalent and hence, when they do not lie in
$\ell(\mathrm{Aut}(\bar{\mathcal{H}}^*, c))$. □

Remark 4.10. To the coloured graph $(\bar{\mathcal{H}}^*, c)$ we can associate a Gaussian graphical
model $M(\bar{\mathcal{H}}^*, c)$ with multivariate nodes, where node $\bar{i}$ is associated to a Gaussian
vector of dimension $c(\bar{i})$. This model coincides with $M(\mathcal{H})$. This also shows, conversely,
that our framework extends to general Gaussian graphical models of chain graphs
with no flags with multivariate nodes.

Computing the essential graph $\mathcal{H}^*$ is not always a simple task. In Section 5 we
show how to identify the group $G$ without finding $\mathcal{H}^*$ in the case when $\mathcal{H}$ is a DAG.
In Section 6 we illustrate Theorem 4.9 with some basic examples.

5. Efficient computations for DAG models 

In this section we present some efficient techniques for computing the group $G^0$
in the case when $\mathcal{H}$ is a DAG. The following characterization of essential graphs of
DAGs will be useful.

Theorem 5.1 (Roverato, Studený [Rov05][Stu04a]). If $\mathcal{H}$ is a DAG then each
connected component of $\mathcal{H}^*$ is decomposable. Moreover, $\mathcal{H}^*$ coincides with the essential
graph of $\mathcal{H}$ as defined in [AMP97] (see also Remark 2.13).

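For any DAG $\mathcal{H}$ on the set of nodes $[m]$, the standard imset for $\mathcal{H}$ is an
integer-valued function $u_{\mathcal{H}} : 2^{[m]} \to \mathbb{Z}$, where $2^{[m]}$ is the set of all subsets of $[m]$, defined
by

\[
u_{\mathcal{H}} = \delta_{[m]} - \delta_{\emptyset} + \sum_{i \in [m]} \left( \delta_{p_{\mathcal{H}}(i)} - \delta_{\{i\} \cup p_{\mathcal{H}}(i)} \right), \tag{6}
\]

where $\delta_A : 2^{[m]} \to \{0, 1\}$ satisfies $\delta_A(B) = 1$ if $A = B$ and is zero otherwise.
For example, it is easy to verify that all DAGs in Figure 1 give rise to the imset
represented by Figure 4.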
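
As a small illustration of formula (6), the sketch below computes the standard imset of a DAG as a dictionary from subsets to non-zero integer coefficients. The encoding (nodes $1, \ldots, m$, arrows as pairs `(a, b)` meaning $a \to b$) and the function name `standard_imset` are assumptions of this example rather than notation from the paper.

```python
from collections import Counter


def standard_imset(m, arrows):
    """Standard imset of a DAG on nodes 1..m, following formula (6).

    arrows is an iterable of pairs (a, b) meaning a -> b.
    Returns a dict {frozenset: coefficient} with zero coefficients dropped.
    """
    parents = {i: set() for i in range(1, m + 1)}
    for a, b in arrows:
        parents[b].add(a)

    u = Counter()
    u[frozenset(range(1, m + 1))] += 1       # + delta_[m]
    u[frozenset()] -= 1                      # - delta_emptyset
    for i, pa in parents.items():
        u[frozenset(pa)] += 1                # + delta_{p(i)}
        u[frozenset(pa | {i})] -= 1          # - delta_{{i} u p(i)}
    return {s: c for s, c in u.items() if c != 0}


# Three Markov-equivalent DAGs on {1, 2, 3} share one imset; the collider does not.
chain = standard_imset(3, [(1, 2), (2, 3)])
fork = standard_imset(3, [(2, 1), (2, 3)])
collider = standard_imset(3, [(1, 2), (3, 2)])
print(chain == fork, chain == collider)
```

Dropping the zero coefficients makes two imsets comparable with plain dictionary equality, which is the comparison used in Lemma 5.2.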

Lemma 5.2 (Corollary 7.1, [Stu04b]). Let $\mathcal{G}, \mathcal{H}$ be two DAGs. Then $\mathcal{H} \in (\mathcal{G})$ if and
only if $u_{\mathcal{G}} = u_{\mathcal{H}}$.

The support of $u_{\mathcal{H}}$ for a DAG $\mathcal{H}$ has been described in [SV09] directly in terms of
the essential graph. To provide this result we introduce some useful notions related
to chain graphs.

Definition 5.3. A set $B \subseteq [m]$ of nodes in a chain graph $\mathcal{H}$ is idle if $i \cdots j$ (that is,
$i$ and $j$ are linked) for all $i, j \in B$; and for every $i \in [m] \setminus B$ and every $j \in B$, $i \to j$ in $\mathcal{H}$.

By [SRS09, Lemma 18] every chain graph has a unique maximal idle set of nodes
(which may be empty), which we denote by $\mathrm{idle}(\mathcal{H})$. The complement of the largest
idle set is called the core of $\mathcal{H}$ and denoted $\mathrm{core}(\mathcal{H})$. Directly from the definition
it follows that $\mathrm{idle}(\mathcal{H})$ is a union of connected components of $\mathcal{H}$. Therefore, the
core is also a union of connected components. The class of core-components, that is,
components in $\mathcal{H}$ contained in $\mathrm{core}(\mathcal{H})$, is denoted by $\mathcal{T}_{\mathrm{core}}(\mathcal{H})$.

Figure 4. The imset $u_{\mathcal{H}}$ where $\mathcal{H}$ is any of the three equivalent
DAGs in Figure 1.

Lemma 5.4. If $\mathcal{H}$ is a NF-CG then $\mathrm{idle}(\mathcal{H}^*)$ forms a clique, that is, all its nodes
are connected by an undirected edge.

Proof. Because there is a directed arrow from any node outside $\mathrm{idle}(\mathcal{H}^*)$ to any node
in $\mathrm{idle}(\mathcal{H}^*)$, every component of $\mathcal{H}^*$ lies either inside or outside of $\mathrm{idle}(\mathcal{H}^*)$. Since
all nodes in $\mathrm{idle}(\mathcal{H}^*)$ are linked, there is a meta-arrow between any two distinct
components of $\mathrm{idle}(\mathcal{H}^*)$ and each component is a clique. Suppose that $\mathrm{idle}(\mathcal{H}^*)$
contains more than one component. Without loss of generality
pick $T$ such that $T'$ is the only child-component of $T$. First note that $p^*(T') \cap T = T$
forms a clique. Second, the parent-components of $T'$ are $T \cup p^*(T)$. Indeed, if
a component $S$, such that $S \Rightarrow T'$, lies outside of $\mathrm{idle}(\mathcal{H}^*)$ then $S \subseteq p^*(T)$ by
definition. If $S \subseteq \mathrm{idle}(\mathcal{H}^*)$ then $S \subseteq p^*(T)$ because $S$ and $T$ are necessarily linked
and $T$ has no other children than $T'$. Thus, by Definition 2.7, $T$ and $T'$ can be legally
merged, which contradicts the fact that $\mathcal{H}^*$ is essential. □

Note that $\mathrm{idle}(\mathcal{H}^*)$ is precisely the set of vertices $i$ such that $\downarrow i = [m]$, where
$\downarrow i = \{j : N^*(i) \subseteq N^*(j)\}$.

From now on $\mathcal{H}$ will always denote a DAG. By Theorem 5.1 each component
$T \in \mathcal{T}_{\mathrm{core}}(\mathcal{H}^*)$ induces a decomposable graph $\mathcal{H}^*_T$. We recall that a decomposable
graph is an undirected graph with no induced cycles of size $\geq 4$. An alternative
definition, that will be useful in this section, is that its maximal cliques can be
ordered into a sequence $C_1, \ldots, C_p$ satisfying the running intersection property (see
[Lau96, Proposition 2.17]), that is

\[
\forall\, i \geq 2 \ \ \exists\, k < i : \quad S_i := C_i \cap \Big( \bigcup_{j < i} C_j \Big) \subseteq C_k. \tag{7}
\]

By [Stu04b, Lemma 7.2] the collection of sets $S_i$ for $2 \leq i \leq p$ does not depend on
the choice of ordering that satisfies (7). We call these sets separators of the graph.
The multiplicity $\nu(S)$ of a separator $S$ is then defined as the number of indices $i$ such
that $S_i = S$. This number also does not depend on the choice of an ordering that
satisfies (7).
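
To make the running intersection property concrete, the following Python sketch computes the separators $S_i$ and their multiplicities from a given ordering of the maximal cliques, and rejects orderings that violate (7). The function name `separators` and the list-of-sets input format are hypothetical choices for this illustration.

```python
from collections import Counter


def separators(cliques):
    """Multiplicities of S_i = C_i & (C_1 | ... | C_{i-1}) for i >= 2.

    cliques: maximal cliques in the proposed order.
    Raises ValueError if some S_i is not contained in an earlier clique,
    i.e. if the ordering violates the running intersection property (7).
    """
    cliques = [frozenset(c) for c in cliques]
    seen = set()
    multiplicity = Counter()
    for index, clique in enumerate(cliques):
        if index > 0:
            sep = clique & seen
            if not any(sep <= earlier for earlier in cliques[:index]):
                raise ValueError("ordering violates the running intersection property")
            multiplicity[sep] += 1
        seen |= clique
    return multiplicity


# A star with centre 2: the separator {2} appears with multiplicity two.
print(separators([{1, 2}, {2, 3}, {2, 4}]))
```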

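By $\mathcal{C}(T)$ we denote the collection of maximal cliques of $\mathcal{H}^*_T$, by $\mathcal{S}(T)$ the collection
of its separators, and by $\nu_T(S)$ the multiplicity of $S \in \mathcal{S}(T)$ in $\mathcal{H}^*_T$. A set $P \subseteq
[m]$ is called a parent set in $\mathcal{H}^*$ if it is non-empty and there exists a component
$T \in \mathcal{T}_{\mathrm{core}}(\mathcal{H}^*)$ with $P = p_{\mathcal{H}^*}(T)$. The multiplicity $\tau(P)$ of $P$ is the number of
$T \in \mathcal{T}_{\mathrm{core}}(\mathcal{H}^*)$ with $P = p_{\mathcal{H}^*}(T)$. The collection of all parent sets in $\mathcal{H}^*$ is denoted
by $\mathcal{P}_{\mathrm{core}}(\mathcal{H}^*)$. Finally, by $i(\mathcal{H}^*)$ we denote the number of initial components of $\mathcal{H}^*$,
that is the components $T \in \mathcal{T}_{\mathrm{core}}(\mathcal{H}^*)$ such that $p_{\mathcal{H}^*}(T) = \emptyset$.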

We refer for the following result to [SV09, Lemma 5.1]. 

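Lemma 5.5. Let $\mathcal{H}^*$ be the essential graph of a DAG $\mathcal{H}$. If $\mathrm{core}(\mathcal{H}^*) = \emptyset$ then
$u_{\mathcal{H}} = 0$. If $\mathrm{core}(\mathcal{H}^*) \neq \emptyset$ then the standard imset for $\mathcal{H}$ has the form

\[
\begin{aligned}
u_{\mathcal{H}} = \delta_{\mathrm{core}(\mathcal{H}^*)}
 &- \sum_{T \in \mathcal{T}_{\mathrm{core}}(\mathcal{H}^*)} \sum_{C \in \mathcal{C}(T)} \delta_{C \cup p_{\mathcal{H}^*}(T)}
 + \sum_{T \in \mathcal{T}_{\mathrm{core}}(\mathcal{H}^*)} \sum_{S \in \mathcal{S}(T)} \nu_T(S)\, \delta_{S \cup p_{\mathcal{H}^*}(T)} \\
 &+ \sum_{P \in \mathcal{P}_{\mathrm{core}}(\mathcal{H}^*)} \tau(P)\, \delta_P + (i(\mathcal{H}^*) - 1)\, \delta_{\emptyset}.
\end{aligned}
\]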

By Lemma 5.2 in [SV09], unless $\mathcal{H}^*$ is a complete graph, the terms in the above
formula never cancel each other. In particular the support of $u_{\mathcal{H}}$ is the collection of
all sets of the form:

(i) the core of $\mathcal{H}^*$

(ii) $C \cup p^*(T)$ for $T \in \mathcal{T}_{\mathrm{core}}(\mathcal{H}^*)$ and $C \in \mathcal{C}(T)$

(iii) $S \cup p^*(T)$ for $T \in \mathcal{T}_{\mathrm{core}}(\mathcal{H}^*)$ and $S \in \mathcal{S}(T)$

(iv) $P$ for $P \in \mathcal{P}_{\mathrm{core}}(\mathcal{H}^*)$

The empty set may or may not appear in the support set of $u_{\mathcal{H}}$ but this does not
play any role in the following arguments.

Proposition 5.6. Let $\mathcal{H}$ be a DAG. Then $N^*(i) \subseteq N^*(j)$ if and only if $i \in A$ implies
$j \in A$ for every $A$ in the support of $u_{\mathcal{H}}$.

Proof. Lemma 5.5 gives the support of $u_{\mathcal{H}}$ in terms of $\mathcal{H}^*$, see also items (i)-(iv)
above. For the forward direction first note that if $i \in C$ then $j \in C \cup p^*(T)$, which
follows immediately from $i \in N^*(j)$. This implies that if $i$ lies in the core then $j$
also lies in the core. Suppose now that $i \in C \cup p^*(T)$ for some $T \in \mathcal{T}_{\mathrm{core}}(\mathcal{H}^*)$ and
$C \in \mathcal{C}(T)$. If $i \in C$ then we have just shown that $j \in C \cup p^*(T)$. If $i \in p^*(T)$ then
$j \in p^*(T)$ because $c^*(i) \subseteq c^*(j)$. The arguments for the subsets of type (iii) and (iv)
above are the same.

For the opposite direction first note that if $i \in A$ implies $j \in A$ for all $A$ in the
support of $u_{\mathcal{H}}$ then taking $A = C \cup p^*(T)$ where $T$ is the connected component
of $i$ and $C \in \mathcal{C}(T)$ we find that either $i - j$ or $j \to i$ and hence $i \in N^*(j)$. Let
$k \in n^*(i) \cup c^*(i)$. Suppose first that $i - j$. If $k \in n^*(i)$ then $k \in n^*(j)$. To see that take
any $C \cup p^*(T)$ such that $i, k \in C$, which implies that $j \in C$. Similarly, if $k \in c^*(i)$ then
$k \in c^*(j)$, which follows by considering $P$ a parent set of the component containing
$k$. Consequently $N^*(i) \subseteq N^*(j)$. The case $j \to i$ is similar. □

Proposition 5.6 gives an efficient procedure of checking when $N^*(i) \subseteq N^*(j)$
without constructing the essential graph $\mathcal{H}^*$, which gives the description of $G^0$. We
present this procedure in the pseudocode below.

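Data: a DAG $\mathcal{H} = ([m], E)$
Result: the set of pairs $(i, j)$ such that $N^*(i) \subseteq N^*(j)$
initialization;
for $i \to j$ in $\mathcal{H}$ do
 | add $i$ to $p_{\mathcal{H}}(j)$;
end
$u_{\mathcal{H}}(\emptyset) := -1$, $u_{\mathcal{H}}([m]) := 1$, $\mathcal{S} := \emptyset$;
for $i = 1$ to $m$ do
 | $u_{\mathcal{H}}(p_{\mathcal{H}}(i)) := u_{\mathcal{H}}(p_{\mathcal{H}}(i)) + 1$; $u_{\mathcal{H}}(p_{\mathcal{H}}(i) \cup \{i\}) := u_{\mathcal{H}}(p_{\mathcal{H}}(i) \cup \{i\}) - 1$;
 | add $p_{\mathcal{H}}(i)$ and $p_{\mathcal{H}}(i) \cup \{i\}$ to $\mathcal{S}$;
end
forall the elements $S$ of $\mathcal{S}$ do
 | if $u_{\mathcal{H}}(S) = 0$ then remove $S$ from $\mathcal{S}$;
end
for $i = 1$ to $m$ do
 | $\mathcal{S}_i := \{S \in \mathcal{S} : i \in S\}$;
end
for $i \cdots j \in E$ do
 | $N^*(i) \subseteq N^*(j)$ if and only if $\mathcal{S}_i \subseteq \mathcal{S}_j$
end

Algorithm 1: The computation of $G^0$ for a DAG $\mathcal{H}$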
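
For readers who prefer executable code, Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: nodes are labelled $1, \ldots, m$, arrows are pairs `(a, b)` meaning $a \to b$, and the function name `neighbourhood_inclusions` is hypothetical. The five-node DAG in the usage line is an assumed encoding of a graph matching the description of the sprinkle graph in Section 6.

```python
from collections import Counter


def neighbourhood_inclusions(m, arrows):
    """Return all ordered pairs (i, j) of linked nodes with N*(i) inside N*(j).

    Uses Proposition 5.6: the inclusion holds iff every set in the support
    of the standard imset that contains i also contains j.
    """
    arrows = list(arrows)
    parents = {i: set() for i in range(1, m + 1)}
    for a, b in arrows:
        parents[b].add(a)

    # Build the standard imset (6).
    u = Counter({frozenset(): -1, frozenset(range(1, m + 1)): 1})
    for i in range(1, m + 1):
        u[frozenset(parents[i])] += 1
        u[frozenset(parents[i] | {i})] -= 1

    # Keep only the support, then index it by the nodes it contains.
    support = [s for s, coeff in u.items() if coeff != 0]
    containing = {i: {s for s in support if i in s} for i in range(1, m + 1)}

    result = set()
    for a, b in arrows:                       # only linked pairs are tested
        for i, j in ((a, b), (b, a)):
            if containing[i] <= containing[j]:
                result.add((i, j))
    return result


# Assumed encoding of the sprinkle graph: 1->2, 1->3, 2->4, 3->4, 4->5.
print(neighbourhood_inclusions(5, [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]))
```

The final loop is the quadratic step: each of the sets `containing[i]` may hold a number of support sets that is linear in $m$.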

In addition note that the size of the support set of $u_{\mathcal{H}}$ is $\leq 2m$. The fact that it is
$\leq 2m + 2$ is obvious from (6). But also any initial vertex $i$ in $\mathcal{H}$ will have $p_{\mathcal{H}}(i) = \emptyset$ and
hence $-\delta_{\emptyset}$ and $\delta_{p_{\mathcal{H}}(i)}$ will cancel each other. It follows that the number of operations
needed to construct $G^0$ is quadratic in $m$. In fact all loops are linear in $m + |E|$ apart
from the penultimate one.

The imset $u_{\mathcal{H}}$ gives in fact the complete description of the group $G$.

Lemma 5.7. Let $\sigma$ be a permutation. Then $\sigma \in G$ if and only if $u_{\mathcal{H}} = \sigma(u_{\mathcal{H}})$, where

\[
\sigma(u_{\mathcal{H}})(S) = u_{\mathcal{H}}(\sigma^{-1}(S)).
\]

Consequently, by Theorem 1.4 we obtain the complete structure of $G$.

Proof. This follows from the fact that $u_{\mathcal{H}}$ is in a one-to-one correspondence with the
DAG model of $\mathcal{H}$. □

Lemma 5.7 does not provide an efficient algorithm to find the automorphism group
of $\mathcal{H}^*$, which in general is a hard problem.
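
For very small graphs the criterion of Lemma 5.7 can nevertheless be checked by brute force. The Python sketch below enumerates all $m!$ permutations and keeps those fixing a given imset, so it only illustrates the statement and is not a practical algorithm. The function name `imset_stabiliser` and the dictionary encoding of the imset are assumptions of this example; the imset used in the usage lines was computed by hand from (6) for the assumed five-node DAG $1 \to 2$, $1 \to 3$, $2 \to 4$, $3 \to 4$, $4 \to 5$.

```python
from itertools import permutations


def imset_stabiliser(m, u):
    """All permutations sigma of 1..m with sigma(u) == u.

    u is a dict {frozenset: non-zero coefficient}. Since
    sigma(u)(S) = u(sigma^{-1}(S)), the image imset carries the
    coefficient of S at the set sigma(S).
    """
    nodes = list(range(1, m + 1))
    stabiliser = []
    for image in permutations(nodes):
        sigma = dict(zip(nodes, image))
        moved = {frozenset(sigma[x] for x in s): c for s, c in u.items()}
        if moved == u:
            stabiliser.append(sigma)
    return stabiliser


fs = frozenset
u = {fs({1, 2, 3, 4, 5}): 1, fs({1}): 1, fs({1, 2}): -1, fs({1, 3}): -1,
     fs({2, 3}): 1, fs({2, 3, 4}): -1, fs({4}): 1, fs({4, 5}): -1}
for sigma in imset_stabiliser(5, u):
    print(sigma)
```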

6. Special graphs and small examples 

Some DAG models are equivalent to undirected graphical models, in which case 
we refer to [DKZ13, Section 7] for examples. To obtain a new set of examples we 
first consider two simple DAGs: the sprinkle graph in Figure 5 and the Verma graph 
in Figure 6. 


Figure 5. The sprinkle graph on the left and its essential graph on
the right.

Figure 6. The Verma graph.

The essential graph of the sprinkle graph is also given in Figure 5. There are no
non-trivial equivalence classes and therefore $\bar{\mathcal{H}}^* = \mathcal{H}^*$. The only nontrivial relation
between neighboring sets is $N^*(5) \subseteq N^*(4)$, so the matrices in $G^0$ have only one
non-zero off-diagonal element on position $(5, 4)$. The group of automorphisms of $\mathcal{H}^*$
has only one non-trivial element which permutes 2 and 3. Hence matrices in $G$ are
in either of the two following forms:

\[
\begin{pmatrix}
* & 0 & 0 & 0 & 0 \\
0 & * & 0 & 0 & 0 \\
0 & 0 & * & 0 & 0 \\
0 & 0 & 0 & * & 0 \\
0 & 0 & 0 & * & *
\end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix}
* & 0 & 0 & 0 & 0 \\
0 & 0 & * & 0 & 0 \\
0 & * & 0 & 0 & 0 \\
0 & 0 & 0 & * & 0 \\
0 & 0 & 0 & * & *
\end{pmatrix}.
\]


The essential graph of the Verma graph $\mathcal{H}$ is equal to the Verma graph itself. All
equivalence classes are singletons. Moreover, no two distinct vertices satisfy
$N^*(i) \subseteq N^*(j)$ and hence $G^0$ is equal to the group of all invertible diagonal matrices.
Since there are no non-trivial automorphisms of $\mathcal{H}$, in fact the whole group $G$
consists solely of diagonal matrices.


Figure 7. The graph of the factor model.

For a slightly more general example consider the DAGs defining factor models as
given in Figure 7. We have $N_{\mathcal{H}}(b_j) \subseteq N_{\mathcal{H}}(a_i)$ for every $i, j$ and there are no other
containment relations. The only non-zero off-diagonal elements of matrices in $G^0$ are
in position $(a_i, b_j)$ for all $i, j$. For example if $p = 2$ and $q = 3$ then they are of the
form

\[
\begin{pmatrix}
* & 0 & * & * & * \\
0 & * & * & * & * \\
0 & 0 & * & 0 & 0 \\
0 & 0 & 0 & * & 0 \\
0 & 0 & 0 & 0 & *
\end{pmatrix}.
\]

Any automorphism of $\mathcal{H}$ is a product of a permutation of $\{a_1, \ldots, a_p\}$
and a permutation of $\{b_1, \ldots, b_q\}$. Consequently all matrices in $G$ look like
the matrices in $G^0$ where the two diagonal blocks are replaced by arbitrary monomial
matrices.


Appendix A. Proof of Theorem 4.3 

To prove this theorem, we will use the following two lemmas, in which K is the 
concentration matrix of the model. 

Lemma A.1. Let $A, B$ be subsets of $[m]$ of the same cardinality satisfying $j \in A$
and $i \notin A$ and either $j \notin B$ or else both $i, j \in B$. If $\det K[A, B]$ is identically zero
on the model but $\det K[A - j + i, B]$ is not, then $E_{ij} \notin \mathfrak{g}$.

Proof. Recall that the one-parameter group $I + tE_{ij}$ acts on $K$ via

\[
K \mapsto (I - tE_{ji}) K (I - tE_{ij}).
\]

In words, this matrix is obtained from $K$ by adding a multiple of the $i$-th row to the
$j$-th row and adding a multiple of the $i$-th column to the $j$-th column. Now consider
the effect of this operation on $K[A, B]$. Since either $j \notin B$ or else both $i, j \in B$,
adding the $i$-th column to the $j$-th has either no effect on $K[A, B]$ or else is just
an elementary column operation on $K[A, B]$. This means that it does not affect the
rank of $K[A, B]$. On the other hand, since $\det K[A - j + i, B]$ is non-zero, the rows
of $K[A - j, B]$ are linearly independent, and since $\det K[A, B]$ is zero, the $j$-th row
$K[j, B]$ lies in the span of the rows of $K[A - j, B]$. This is not true for the $i$-th row
$K[i, B]$, hence the $A \times B$-submatrix of $K + tE_{ji}K + tKE_{ij}$ has full rank for generic
$K$. This means that $I + tE_{ij}$ does not preserve the model, hence it does not lie in
the group $G$. □

Lemma A.2. Let $A, B$ be subsets of $[m]$ of the same cardinality satisfying $j \in A \cap B$
and $i \notin A \cup B$. If $\det K[A, B]$ is identically zero but $\det K[A - j + i, B] +
\det K[A, B - j + i]$ is not, then $E_{ij} \notin \mathfrak{g}$.

Proof. Since $\det K[A, B] = 0$, $E_{ij} \in \mathfrak{g}$ only if the determinant of the $(A, B)$-submatrix of
$K_t := (I - tE_{ji})K(I - tE_{ij})$ is zero. To show that it is not zero it suffices to show that
the linear term in $t$ does not vanish. To study this linear term, we alternatively
study the linear term of $(I - sE_{ji})K(I - tE_{ij})$, further specializing to $s = t$. Because
$E_{ij}$ has rank 1, the determinant of the $(A, B)$-submatrix of $(I - sE_{ji})K(I - tE_{ij})$ is
a polynomial of order two in $s, t$. To find the coefficient of its linear term in $s$ we can
set $t = 0$. The matrix $(I - sE_{ji})K$ is obtained by adding a multiple of the $i$-th row to the
$j$-th row. Suppose that the elements of $A$ are $a_1 < a_2 < \cdots < a_d$ and the elements
of $B$ are $b_1 < b_2 < \cdots < b_d$. Let $1 \leq k \leq d$ be such that $j = a_k$. The determinant
of its $(A, B)$-submatrix can be computed by expanding along the $k$-th row (which
corresponds to the $j$-th row of $K$):

\[
\begin{aligned}
\det\big((I - sE_{ji})K\big)[A, B] &= \sum_{l=1}^{d} (-1)^{k+l} (K_{j b_l} - sK_{i b_l}) \det K[A - j, B - b_l] \\
&= \det K[A, B] - s \det K[A - j + i, B].
\end{aligned}
\]

Similar computations for the coefficient of $t$ give

\[
\det\big(K(I - tE_{ij})\big)[A, B] = \det K[A, B] - t \det K[A, B - j + i].
\]

Hence the coefficient of $t$ in the determinant of $K_t[A, B]$ is $-\det K[A - j + i, B] -
\det K[A, B - j + i]$. If this sum does not identically vanish on the model then
$E_{ij} \notin \mathfrak{g}$. □
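
The expansion used in the proof of Lemma A.2 can be checked on a concrete matrix. The sketch below is an illustration only: it uses exact integer arithmetic, 0-based indices, and an arbitrary symmetric integer matrix chosen for this example (it is not the concentration matrix of any particular model). All function names are hypothetical. Since the determinant of $K_t[A, B]$ is a polynomial of degree at most two in $t$, its linear coefficient equals $(f(1) - f(-1))/2$.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not M:
        return 1
    return sum((-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
               for c in range(len(M)))


def submatrix(K, rows, cols):
    return [[K[r][c] for c in cols] for r in rows]


def act(K, i, j, t):
    """Return (I - t E_ji) K (I - t E_ij) for 0-based indices i, j."""
    M = [row[:] for row in K]
    M[j] = [x - t * y for x, y in zip(M[j], M[i])]   # row j minus t times row i
    for row in M:
        row[j] -= t * row[i]                          # column j minus t times column i
    return M


def linear_coefficient(K, A, B, i, j):
    """Coefficient of t in det K_t[A, B], read off from f(1) and f(-1)."""
    f = lambda t: det(submatrix(act(K, i, j, t), A, B))
    return (f(1) - f(-1)) // 2


def predicted_coefficient(K, A, B, i, j):
    """-det K[A - j + i, B] - det K[A, B - j + i], with i placed in the slot of j."""
    A_swapped = [i if a == j else a for a in A]
    B_swapped = [i if b == j else b for b in B]
    return -det(submatrix(K, A_swapped, B)) - det(submatrix(K, A, B_swapped))


K = [[2, 1, 0, 3],
     [1, 4, 5, 1],
     [0, 5, 6, 2],
     [3, 1, 2, 7]]
A, B, i, j = [1, 2], [1, 3], 0, 1          # j lies in both A and B, i in neither
print(linear_coefficient(K, A, B, i, j), predicted_coefficient(K, A, B, i, j))
```

Placing $i$ in the position previously occupied by $j$ keeps the sign convention of the row expansion, which is why no extra sign appears in `predicted_coefficient`.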

Lemma 4.2 gives one direction of the proof of Theorem 4.3; we need only prove
that if $i \neq j$ and $N_{\mathcal{H}}(i) \not\subseteq N_{\mathcal{H}}(j)$, then $E_{ij} \notin \mathfrak{g}$. First of all, if there is no cup from
$j$ to $i$, then $K[j, i]$ is identically zero, while $K[i, i]$ is not. Hence $E_{ij} \notin \mathfrak{g}$ (this is
the special case of Lemma A.1 with $A = \{j\}$ and $B = \{i\}$). Thus in what follows
we may assume that there do exist cups from $j$ to $i$. We treat the various types of
cups from $j$ to $i$ separately; in each case, we assume that cups of the previous types
do not exist. Before we get going, we remark that, since there are no flags, for any
cup $(f, h, k, l)$ with $f \to h$ also $(f, k, k, l)$ is a cup. The following lemma will also be
useful.

Lemma A.3. Let $u$ be a vertex in a NF-CG $\mathcal{H}$. Let $D$ be the set of children of $u$
together with all their descendants. Then for every vertex $v \notin D \cup \{u\}$ such that
there is no link between $u$ and $v$ we have $\det K[D \cup \{u\}, D \cup \{v\}] = 0$.

Proof. By Corollary 3.5 it is enough to show that there is no self-avoiding cup system
from $D \cup \{u\}$ to $D \cup \{v\}$. It is clear that the second element of every cup starting
in $d \in D$ needs to lie in $D$ just because it is either equal to $d$ or it is equal to $d'$
such that $d \to d'$ in $\mathcal{H}$. Also every cup from $u$ needs to have its second entry in $D$.
Indeed, let $(u, l_2, l_3, l_4)$ be such a cup. The node $l_2$ is either equal to $u$ or it is a child
of $u$, in which case it lies in $D$. So suppose that $l_2 = u$ and show that this leads
to a contradiction. If $l_2 = u$ then $l_3$ is either $u$ or a neighbor of $u$. If $l_3 = u$ then
$l_4$ must be a parent of $u$, which cannot be a vertex of $D$ (because otherwise there
is a semi-directed cycle in $\mathcal{H}$) and it cannot be $v$ because there is no arrow $v \to u$
(by assumption). If $l_3 \in n_{\mathcal{H}}(u)$ then $l_4$ must be a parent of $l_3$ and by the no flag
assumption also a parent of $u$. This situation is also impossible because $l_4$ cannot lie
in $D \cup \{v\}$. Hence, by the pigeon hole principle, in any cup system from $D \cup \{u\}$ to
$D \cup \{v\}$, two of the elements after one step coincide, and this proves the claim. □

In what follows we assume that $\mathcal{H}$ is essential.

I. Vertex $i$ lies in $n_{\mathcal{H}}(j) \cup c_{\mathcal{H}}(j)$. In that case there must exist

\[
l \in (n_{\mathcal{H}}(i) \cup c_{\mathcal{H}}(i)) \setminus (n_{\mathcal{H}}(j) \cup c_{\mathcal{H}}(j)).
\]

Let $D$ denote the set of all children of $l$ together with their descendants. We have
$i, j \notin D$ and thus $A := D + j$ and $B := D + l$ have the same cardinalities. By Lemma
A.3 with $u = l$, $v = j$ we have $\det K[A, B] = 0$. On the other hand, there does exist a
self-avoiding cup system from $A - j + i$ to $B$ that links $i$ directly to $l$ without crossing
$D$ and each $d \in D - j$ to itself via $(d, d, d, d)$ and hence $\det K[A - j + i, B] \neq 0$ by
Corollary 3.5. Now $E_{ij} \notin \mathfrak{g}$ by Lemma A.1.

II. There is no arrow $i \to j$. In that case let $D$ be the set of all children of $i$
together with their descendants. Set $A := D + j$ and $B := D + i$. By Lemma A.3
$\det K[A, B] = 0$. But clearly $\det K[A - j + i, B] = \det K[B, B] \neq 0$.

Mid-proof break. We pause a moment to point out that we have used that $\mathcal{H}$
has no flags, but not yet that it is essential. This will be exploited in the following
arguments. Indeed, in the remaining cases, there must be an arrow $i \to j$. This
arrow must be essential, hence either the parents of $j$ in the undirected component
$T$ of $i$ do not form a clique, or else one of $\{i, j\}$ has a parent outside $T$ that is not a
parent of the other. We deal with these cases as follows.

III. There is an arrow $k \to j$ with $k$ in the component of $i$ at distance
at least 2. In that case let $D$ be the set of all children of $i$ together with their
descendants. Set $A := D + k$ and $B := D + i$. By Lemma A.3 $\det K[A, B] = 0$. But,
as in the first case, $\det K[A - j + i, B] \neq 0$ because there is a self-avoiding cup system
from $A - j + i$ to $B$ given by $(d, d, d, d)$ for $d \in D - j + i$ and $(k, j, j, j)$. Again, we
conclude that $E_{ij} \notin \mathfrak{g}$.


IV. There is an induced subgraph like in Figure 8a. Let $D$ be the set of all
children of $k$ together with their descendants. Set $A = D + k$ and $B = D + l$ and
note that both $A$ and $B$ contain $j$. We again have $\det K[A, B] = 0$ by Lemma A.3.
However, both $\det K[A - j + i, B]$ and $\det K[A, B - j + i]$ are nonzero. Even more:
the sum of these two determinants is also nonzero because $\det K[A - j + i, B]$ has
a monomial that does not appear in $\det K[A, B - j + i]$: consider the cup system
from $A - j + i$ to $B$ given by $(i, i, l, l)$, $(k, j, j, j)$ and $(d, d, d, d)$ for all $d \in D - j$. By
Proposition 3.4 this system corresponds to a monomial in $\det K[A - j + i, B]$. On the
other hand this monomial cannot appear in $\det K[A, B - j + i]$ because it contains
only one element of $\Lambda$, namely $\lambda_{kj}$, and only one off-diagonal element of $\Omega$, namely
$\omega_{il}$. This means that it must correspond to a cup system between $A$ and $B - j + i$
that contains only one undirected edge $i - l$ and one arrow $k \to j$. However any cup
from $i$ to $A$ must contain either an arrow $i \to p$ for some $p \in D$ or an undirected
edge $i - k$. By Lemma A.2 we conclude that $E_{ij} \notin \mathfrak{g}$.

Figure 8. Some special induced subgraphs considered in the proof.

V. There is an arrow $k \to j$ with $k \notin T$ and no arrow between $k$ and $i$.
So we have the induced subgraph $i \to j \leftarrow k$. Let $D$ be the set of all children of
$i$ together with their descendants. Set $A := D + k$ and $B := D + i$. By Lemma
A.3 $\det K[A, B] = 0$. On the other hand, $\det K[A - j + i, B] \neq 0$, because of the
self-avoiding cup system from $A - j + i$ to $B$ consisting of $(k, j, j, j)$ and $(i, i, i, i)$ and
$(d, d, d, d)$ for all $d \in D - j$. Again, we may apply Lemma A.1, this time with $i, j$
both in $B$, to conclude that $E_{ij} \notin \mathfrak{g}$.

VI. There is an arrow $l \to i$ with no arrow from $l$ to $j$. Pictorially, we have
$l \to i \to j$. Let $D$ be the set of children of $j$ together with all their descendants.
Set $A = D + j$ and $B = D + l$. By Lemma A.3 we have $\det K[A, B] = 0$. However,
$\det K[A - j + i, B] \neq 0$ and hence $E_{ij} \notin \mathfrak{g}$ by Lemma A.1.

VII. There is an arrow $i \to l$ and $l \to j$. Without loss of generality we can
assume that $l$ is minimal in the sense that if $i \to l'$, $l' \to j$ then there is no arrow
from $l$ to $l'$. Since $\mathcal{H}$ is essential, $l \to j$ is an essential arrow. This implies one
of the following possibilities:

(i) There exists $k$ in the component of $l$ with distance at least two to $l$ and with
$k \to j$.

(ii) There is an induced subgraph like in Figure 8b.

(iii) There are arrows $l \to k$, $k \to j$.

(iv) There is an arrow $k \to l$ and no arrow from $k$ to $j$.

VII. (i). In this case we have an induced subgraph $k \to j \leftarrow l$. Let $D$ be the set
of children of $l$ and all their descendants. Set $A = D + l$ and $B = D + k$. The
argument that $\det K[A, B] = 0$ is the same as in the previous cases. By Lemma
A.2 $E_{ij} \notin \mathfrak{g}$ because $\det K[A - j + i, B] + \det K[A, B - j + i] \neq 0$. To verify this
last statement note that by Proposition 3.4 $\det K[A - j + i, B]$ contains a monomial
corresponding to the cup system $(d, d, d, d)$ for $d \in D - j$, $(i, k, k, k)$ and $(l, j, j, j)$.
This monomial contains $\lambda_{ki}$, $\lambda_{lj}$ and no off-diagonal $\omega$'s. There is no cup system
from $A$ to $B - j + i$ that uses only $k \to j$ and $l \to j$ and hence this monomial does
not appear in $\det K[A, B - j + i]$.

VII. (ii). Let $D$ be the set of children of $p$ and all their descendants. Set $A = D + p$
and $B = D + r$. Again $\det K[A, B] = 0$ but $\det K[A - j + i, B] + \det K[A, B - j + i] \neq 0$.
For this, we note that $\det K[A - j + i, B]$ contains a monomial corresponding to the
cup system $(d, d, d, d)$ for $d \in D - j$, $(i, r, r, r)$ and $(p, j, j, j)$, which does not appear
in $\det K[A, B - j + i]$. Now $E_{ij} \notin \mathfrak{g}$ by Lemma A.2.

VII. (iii). Note that in this case no link between $i$ and $k$ is possible (by minimality
of $l$ and the no semi-directed cycle assumption). But then $E_{ij} \notin \mathfrak{g}$ by Case V.

VII. (iv). Note that in this case by case VI the arrow $k \to i$ is impossible and thus
we have either $i - k$, $i \to k$ or there is no link between them. The induced subgraph
is given in Figure 8c, where the dashed edge indicates the three possibilities for the
link between $i$ and $k$. Let $D$ be the set of children of $j$ together with all descendants.
Set $A = D + j$, $B = D + k$. Again by Lemma A.3 we have that $\det K[A, B] = 0$.
Moreover, $\det K[A - j + i, B] \neq 0$. Now $E_{ij} \notin \mathfrak{g}$ by Lemma A.1. This exhausts all
possible cases and hence finishes the proof.